[v22.x] src: clamp WriteUtf8 capacity to INT_MAX in EncodeInto#62621
semimikoh wants to merge 2 commits into nodejs:v22.x-staging
Conversation
In TextEncoder.encodeInto, the destination buffer's byte length is
read as a size_t but then implicitly narrowed to int when passed as
the capacity argument to v8::String::WriteUtf8. When the destination
view is larger than INT_MAX (2,147,483,647 bytes), the narrowing
conversion underflows to a negative value, V8 treats it as "no
capacity", and writes 0 bytes, returning { read: 0, written: 0 }
even though the buffer has plenty of room.
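The narrowing can be reproduced in plain JavaScript with `| 0`, which performs the same two's-complement truncation to a signed 32-bit value (view length taken from the repro in #62610):

```javascript
// Byte length of the destination subarray view from the #62610 repro.
const viewLength = 2429682061 - 38928786; // 2390753275, larger than INT_MAX

// `| 0` truncates to a signed 32-bit integer, mirroring the implicit
// size_t -> int narrowing in the C++ binding.
console.log(viewLength);     // 2390753275
console.log(viewLength | 0); // -1904214021, which V8 reads as "no capacity"
```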
Clamp the capacity to INT_MAX before passing it to WriteUtf8. This
is sufficient because the source string in encodeInto is bounded in
practice and never requires more than INT_MAX bytes to encode; only
the destination view length can exceed INT_MAX.
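A minimal sketch of the clamp in JavaScript for illustration (the actual fix lives in the C++ binding; `| 0` stands in for the `size_t` to `int` narrowing):

```javascript
const INT_MAX = 2147483647; // 2 ** 31 - 1

// Clamping before the narrowing keeps the capacity non-negative, so
// WriteUtf8 still sees ample room for any realistic source string.
const viewLength = 2390753275;                  // destination view > INT_MAX
console.log(viewLength | 0);                    // -1904214021: buggy narrowing
console.log(Math.min(viewLength, INT_MAX) | 0); // 2147483647: clamped, valid
```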
This issue is already fixed on main and v24.x as a side effect of
PR nodejs#58070, which migrated to the non-deprecated WriteUtf8V2 method
whose capacity parameter is size_t. WriteUtf8V2 is not available in
v22.x's V8 version, so this minimal patch fixes only the EncodeInto
path instead of backporting the full migration.
Refs: nodejs#58070
Fixes: nodejs#62610
Please also ensure your commits are signed off.
Signed-off-by: semimikoh <ejffjeosms@gmail.com>
Renegade334 left a comment
Implementation LGTM, I think the tests could do with being simplified though.
```js
const bounded = encoder.encodeInto(source,
                                   dest.subarray(offset,
                                                 offset + expected.length));
assert.deepStrictEqual(bounded, {
  read: source.length,
  written: expected.length,
});
assert.deepStrictEqual(dest.slice(offset, offset + expected.length), expected);
```
This test is unnecessary – it isn't testing passing a large typed array to encodeInto, and succeeds both before and after the change.
```js
if (e.code === 'ERR_MEMORY_ALLOCATION_FAILED' ||
    /Array buffer allocation failed/.test(e.message)) {
  common.skip('insufficient space for Uint8Array allocation');
}
throw e;
```
Checking for a specific error condition is unnecessary here – there is no other error that can occur when calling new TypedArray(n) with a safe integer.
```js
const large = encoder.encodeInto(source, dest.subarray(offset));
assert.deepStrictEqual(large, {
  read: source.length,
  written: expected.length,
});
assert.deepStrictEqual(dest.slice(offset, offset + expected.length), expected);
```
There's really no need to mess around with creating a typed array and then deriving a slightly shorter array that's still larger than INT_MAX. Just make the typed array with size 2 ** 31 at the start, and use it directly.
Suggested change:

```diff
-const large = encoder.encodeInto(source, dest.subarray(offset));
-assert.deepStrictEqual(large, {
-  read: source.length,
-  written: expected.length,
-});
-assert.deepStrictEqual(dest.slice(offset, offset + expected.length), expected);
+const result = encoder.encodeInto(source, dest);
+assert.deepStrictEqual(result, {
+  read: source.length,
+  written: expected.length,
+});
+assert.deepStrictEqual(dest.subarray(0, expected.length), expected);
```
Backport notice
This is a v22.x-only patch, not a cherry-pick. The bug it fixes was
incidentally resolved on main/v24.x by #58070, which migrated
`EncodeInto` from `v8::String::WriteUtf8` to `WriteUtf8V2`. The
intent of #58070 was deprecation cleanup, not bug fixing.
`WriteUtf8V2` is not available in v22.x's bundled V8 version, so
backporting #58070 in full is not feasible. This PR instead applies
a minimal, scoped fix to `EncodeInto` only.

Problem
In `TextEncoder.encodeInto`, the destination buffer's byte length
is read as `size_t` but then implicitly narrowed to `int` when
passed as the capacity argument to `v8::String::WriteUtf8`. When
the destination view is larger than `INT_MAX` (2,147,483,647
bytes), the narrowing conversion underflows to a negative value.
V8 treats this as "no capacity available" and writes 0 bytes,
returning `{ read: 0, written: 0 }` even though the buffer has
ample space.
Reproduction on v22.x (from #62610):
```js
const _TE = new TextEncoder();
const s = "aÿ我𝑒";
const u8a = new Uint8Array(2429682061); // ~2.4 GB
const offset = 38928786;
_TE.encodeInto(s, u8a.subarray(offset));
// v22.x: { read: 0, written: 0 } ← bug
// v24.x: { read: 5, written: 10 } ← correct
_TE.encodeInto(s, u8a.subarray(offset, offset + 10));
// Both versions: { read: 5, written: 10 }
// (works because the bounded view is only 10 bytes, well within int range)
```
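For reference, the correct `{ read: 5, written: 10 }` follows from how `encodeInto` counts: `read` is UTF-16 code units consumed from the source string, `written` is UTF-8 bytes produced:

```javascript
const s = 'aÿ我𝑒';
// 'a', 'ÿ', and '我' are one UTF-16 code unit each; '𝑒' is a surrogate pair.
console.log(s.length); // 5
// UTF-8 widths: 'a' = 1 byte, 'ÿ' = 2, '我' = 3, '𝑒' = 4, so 10 bytes total.
console.log(new TextEncoder().encode(s).length); // 10
```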
This is a silent data loss bug — no error is thrown, the caller
just gets a successful-looking return value with zero bytes written.
Fix
Clamp the capacity passed to `WriteUtf8` to `INT_MAX` in
`EncodeInto`. The source string in `encodeInto` is bounded in
practice and never requires more than `INT_MAX` bytes to encode;
only the destination view length can exceed `INT_MAX`.

Testing
Built locally on v22.x-staging (`v22.22.2-pre`) and verified:
- `{ read: 5, written: 10 }` for both unbounded and bounded subarrays
- The new test (`test/pummel/test-whatwg-encoding-encodeinto-large.js`) passes with the patched binary. It follows existing pummel conventions: skips on 32-bit architectures via `common.skipIf32Bits()`, skips on insufficient `os.totalmem()`, and skips on allocation failure
- `test/parallel/test-whatwg-encoding*`, `test-textencoder*`, `test-textdecoder*`: 11 tests pass
- `test/parallel/test-buffer-*`, `test-string-decoder*`: 69 tests pass
- `test/pummel/test-whatwg-encoding*`: passes
- `cpplint.py` on the modified file: clean
- `make lint-js`: clean

Refs: #58070
Fixes: #62610